NOVEL YEAST TWO-HYBRID SYSTEM AND USE THEREOF


Related Applications 
This application claims priority from U.S. Provisional Application No.
60/259,759 filed on January 4, 2001, which is incorporated herein by reference in its
entirety.

Field of the Invention 
The present invention generally relates to methods for detecting protein-protein
interactions, and particularly to two-hybrid systems for detecting protein-protein
interactions.


Background of the Invention
There has been much interest in protein-protein interactions in the field of
proteomics. A number of biochemical approaches have been used to identify interacting
proteins. These approaches generally employ the affinities between interacting proteins
to isolate proteins in a bound state. Examples of such methods include
coimmunoprecipitation and copurification, optionally combined with cross-linking to
stabilize the binding. Identities of the isolated protein interacting partners can be
characterized by, e.g., mass spectrometry. See, e.g., Rout et al., J. Cell Biol.,
148:635-651 (2000); Houry et al., Nature, 402:147-154 (1999); Winter et al., Curr. Biol.,
7:517-529 (1997). A popular approach useful in large-scale screening is the phage display
method, in which filamentous bacteriophage particles are made by recombinant DNA
technologies to express a peptide or protein of interest fused to a capsid or coat protein of
the bacteriophage. A whole library of peptides or proteins of interest can be expressed
and a bait protein can be used to screen the library to identify peptides or proteins
capable of binding to the bait protein. See, e.g., U.S. Patent Nos. 5,223,409; 5,403,484;
5,571,698; and 5,837,500. Notably, the phage display method only identifies those
proteins capable of interacting in an in vitro environment, while the
coimmunoprecipitation and copurification methods are not amenable to high throughput
screening.

The yeast two-hybrid system is a genetic method that overcomes certain
shortcomings of the above approaches. The yeast two-hybrid system has proven to be a
powerful method for the discovery of specific protein interactions in vivo. See generally,
Bartel and Fields, eds., The Yeast Two-Hybrid System, Oxford University Press, New
York, NY, 1997. The yeast two-hybrid technique is based on the fact that the DNA-
binding domain and the transcriptional activation domain of a transcriptional activator
contained in different fusion proteins can still activate gene transcription when they are
brought into proximity to each other. As shown in Figure 1, in a yeast two-hybrid
system, two fusion proteins are expressed in yeast cells. One has a DNA-binding domain
of a transcriptional activator fused to a test protein. The other includes a
transcriptional activation domain of the transcriptional activator fused to
another test protein. If the two test proteins interact with each other in vivo, the two
domains of the transcriptional activator are brought together, reconstituting the
transcriptional activator and activating a reporter gene controlled by the transcriptional
activator. See, e.g., U.S. Patent No. 5,283,173.

Because of its simplicity, efficiency and reliability, the yeast two-hybrid system
has gained tremendous popularity in many areas of research. Numerous protein-protein
interactions have been identified using the yeast two-hybrid system. The identified
proteins have contributed significantly to the understanding of many signal transduction
pathways and other biological processes. For example, the yeast two-hybrid system has
been successfully employed in identifying a large number of novel cell cycle regulators
that are important in complex cell cycle regulations. Using known proteins that are
important in cell cycle regulation as baits, other proteins involved in cell cycle control
were identified by virtue of their ability to interact with the baits. See generally, Hannon
et al., in The Yeast Two-Hybrid System, Bartel and Fields, eds., pages 183-196, Oxford
University Press, New York, NY, 1997. Examples of cell cycle regulators identified by
the yeast two-hybrid system include CDK4/CDK6 inhibitors (e.g., p16, p15, p18 and
p19), Rb family members (e.g., p130), Rb phosphatase (e.g., PP1-α2), Rb-binding
transcription factors (e.g., E2F-4 and E2F-5), general CDK inhibitors (e.g., p21 and p27),
CAK cyclin (e.g., cyclin H), and CDK Thr161 phosphatase (e.g., KAP and CDI1). See
id. "[T]he two-hybrid approach promises to be a useful tool in our ongoing quest for
new pieces of the cell cycle puzzle." See id. at page 193. In another example, the yeast
two-hybrid system proved to be a powerful approach in analyzing the yeast pheromone
response pathway, a complex multistep signal transduction process in haploid yeast cell
mating. See generally, Sprague et al., in The Yeast Two-Hybrid System, Bartel and
Fields, eds., pages 173-182, Oxford University Press, New York, NY, 1997. As
described in Sprague, various genes were isolated from mutant yeast strains having
altered pheromone response patterns. However, it was not clear how the proteins
encoded by these genes function in the pheromone response pathway. The yeast two-
hybrid system was utilized to test such proteins and mutant forms thereof for their ability
to interact with each other. As a result, new insights and better understandings of the
complex process were achieved. See id.

The classic yeast two-hybrid system depends on gene activation in the yeast nucleus
and has generally required that specific protein-protein interactions between fusion
proteins occur within the nucleus of yeast cells. Thus, although the conventional yeast
two-hybrid system has been used successfully in the discovery of numerous protein
interactions, its usefulness may be limited when it is used in detecting those protein-
protein interactions that require a non-nuclear environment. For example, many cell
surface proteins and their ligands contain disulfide bonds, which can be disrupted under 
the intracellular reducing conditions. Additionally, posttranslational protein 
modifications, particularly glycosylation, typically would preclude the nuclear 
localization of the modified proteins. 

Cytosolic and cell surface protein-protein interactions play major roles in normal 
cellular functions and biological responses. In particular, many cytosolic and cell surface 
protein-protein interactions are involved in disease pathways. For example, attacks by 
pathogens such as viruses and bacteria on mammalian cells typically begin with
interactions between viral or bacterial proteins and mammalian cell surface proteins. 
Therefore, there is a need in the art for improved methods that can be used to efficiently 
detect cytosolic and cell surface protein-protein interactions. 

Summary of the Invention
This invention provides a versatile and sensitive yeast-based assay system for 
detecting protein-protein interactions that circumvents the above-described limitations 
inherent in prior art methods. Particularly, the present invention utilizes the so-called 
inteins, which are peptide sequences capable of directing protein trans-splicing. An 
intein is an intervening protein sequence in a protein precursor that is excised from the 
protein precursor during protein splicing. Protein splicing results in the concomitant 
ligation of the flanking protein fragments, i.e., the exteins, with a native peptide bond, 
thus forming a mature extein protein and the free intein. It is now known that inteins 
incorporated into non-native precursors can also cause protein-splicing and excision of 
the inteins. In addition, an N-terminal intein fragment in a fusion protein and a C-
terminal intein fragment in another fusion protein, when brought into contact with each
other, can bring about trans-splicing between the two fusion proteins. Thus, in
accordance with the present invention, two-hybrid fusion proteins are provided in yeast
cells. One has a first test polypeptide and an N-terminal intein fragment or N-intein, and
the other has a second test polypeptide and a C-terminal intein fragment or C-intein. In
addition, one or both fusion proteins may have a reporter that undergoes detectable 
changes upon trans-splicing of the fusion proteins. If the first and second test 
polypeptides interact with each other, thus bringing the N-intein and C-intein into close
proximity, protein trans-splicing takes place. As a result, the fusion proteins are spliced, 
causing detectable changes in the reporter. Thus, by detecting the changes in the 
reporter, interactions between two test polypeptides can be determined. 

Unlike the traditional two-hybrid systems, the intein-based yeast two-hybrid 
system of the present invention does not require that the interacting proteins be 
transported into the cell nucleus. Thus, the system is useful in determining protein-protein
interactions that require a specific cellular environment. For example, the system can be
employed to detect interactions between nuclear proteins, between cytosolic proteins, and
between membrane or extracellular proteins. 

Additionally, protein trans-splicing mediated by the N-intein and C-intein is 
independent of other cellular factors and does not require the action of additional proteins 
such as proteases. This makes the assay system of the present invention more reliable 
and easier to perform as compared to the assay methods known in the art for detecting 
protein-protein interactions. 

Another distinct feature of the intein-based yeast assay is that the detection of 
protein-protein interaction is based on the occurrence of protein trans-splicing events, 
which typically are associated with protein cleavage and result in new protein structures 
and functions. Thus, the intein-based assay is well-suited to exploit the numerous direct 
and indirect methods available in the art for detecting changes in protein structures and 
functions. Because the intein-based assay can accommodate these numerous detection 
methods, there is great flexibility in choosing methods that are optimal for a particular 
condition. 

Furthermore, in contrast to prokaryote-based systems, the intein-based yeast two-
hybrid system of the present invention utilizes eukaryotic yeast cells in which
mammalian proteins, particularly human proteins, can be easily expressed with high
fidelity and efficiency. In addition, the cell compartmental localization of mammalian
proteins or fusion proteins containing mammalian components is more likely to resemble
their native state in yeast cells than in bacteria. Thus, the yeast-based system is a much
more reliable and versatile system. It is amenable to protein-protein interactions that are
not detectable by prokaryote-based systems while producing fewer false-positive protein-
protein interactions that do not naturally occur.

Briefly, two fusion proteins are expressed in a yeast cell and allowed to interact 
with each other. One of the two fusion proteins includes an N-intein and a first test 
polypeptide, and the other fusion protein includes a C-intein and a second test 
polypeptide. One or both of the two fusion proteins have an inactive reporter capable of 
being converted to an active reporter upon trans-splicing through the N-intein and the C- 
intein. The change in the active reporter level is determined. An increase in the amount 
of the active reporter would indicate that the first and second test polypeptides interact
with each other through, e.g., binding affinity, to result in the trans-splicing of the two
fusion proteins mediated by the N-intein and the C-intein. Preferably, the N-intein and
C-intein are not associated with each other and do not exhibit any significant binding
affinity to each other. Nor do they associate with or bind to the inactive reporter or test
polypeptides in the fusion proteins.

Attorney Docket No. 14IR.01

In one embodiment, the inactive reporter can be a polypeptide linked to one of the 
fusion proteins, and is cleaved off into a free form from the fusion protein upon protein 
trans-splicing. The reporter polypeptide can be selected and the fusion proteins can be 
designed such that the precursor form of the polypeptide is inactive while the free 
reporter released from the fusion protein is active, i.e., is detectable directly or indirectly. 

In another embodiment, one of the two fusion proteins has a nonfunctional 
portion of a reporter polypeptide linked to the N-terminus of the N-intein. The other 
fusion protein comprises a distinct but similarly nonfunctional portion of the same 
reporter polypeptide linked to the C-terminus of the C-intein. Upon trans-splicing 
between the two fusion proteins through the N- and C-inteins, the two inactive reporter 
polypeptides are ligated together with a peptide bond, thereby forming an active reporter
protein, which is detectable directly or indirectly.

To express the above-described fusion proteins, chimeric genes encoding the 
fusion proteins are introduced into a host cell. The amount of the active reporter protein 
in the host cell is then determined. In a preferred embodiment, a first chimeric gene 
encoding one of the two fusion proteins is expressed in a haploid Saccharomyces cell of a 
mating type and a second chimeric gene encoding the other fusion protein is expressed in 
a haploid Saccharomyces cell of mating type α. The two cells are mated to form a
diploid cell, and any change in the amount of the active reporter protein in the diploid is 
then determined. 

In a specific embodiment, the expression of one or more of the chimeric genes 
can be made inducible, e.g., by placing the genes under the control of an inducible 
promoter, such that one or more of the fusion proteins are produced when the host cell is
subjected to a predetermined condition.

In addition, the assay can also be conducted in the presence of a third polypeptide. 
In this manner, the interaction between the first and second test polypeptides can be 
detected if the interaction requires the presence of the third polypeptide. The third
polypeptide may be a protein having affinity to either the first or second test polypeptides
or both. Alternatively, the third polypeptide can modify one or both test polypeptides,
e.g., by phosphorylation, glycosylation, and the like.

The techniques used for monitoring the occurrence of protein trans-splicing
events and detecting an active reporter will depend on the inactive reporter used and the
active reporter derived therefrom. The system of the present invention can be designed
such that an active reporter can be detected based on changes in protein sizes or other
properties, or activation of certain protein functions. For example, detection of an active
reporter can be based on cell viability assays, color assays, and the like.

In accordance with another aspect of the present invention, a kit for detecting
protein-protein interactions is provided, which includes a first expression vector
containing a first chimeric gene having operably linked in the same open reading frame:
(a) a sequence encoding a first inactive reporter polypeptide; (b) a coding sequence for
N-intein; and (c) a first multiple cloning site. The kit also includes a second expression
vector containing a second chimeric gene having operably linked in the same open
reading frame: (a) a second multiple cloning site; (b) a coding sequence for C-intein; and
(c) a sequence encoding a second inactive reporter polypeptide, wherein ligation between
the C-terminus of said first inactive reporter polypeptide and the N-terminus of said
second inactive reporter polypeptide forms an active reporter. Preferably the kit further
includes a yeast cell deficient in the active reporter protein.

The foregoing and other advantages and features of the invention, and the manner
in which the same are accomplished, will become more readily apparent upon
consideration of the following detailed description of the invention taken in conjunction
with the accompanying examples and drawings, which illustrate preferred or exemplary
embodiments.

Brief Description of the Drawings 
Figure 1 is an illustration of the classic yeast two-hybrid system known in the art;

Figure 2A illustrates a genetic selection process for selecting N-inteins and C-
inteins that do not interact with each other;

Figure 2B shows a process for verifying that the selected non-interacting N-intein
and C-intein are capable of mediating protein trans-splicing;

Figures 3A-3F are diagrams illustrating the fusion proteins in different 
embodiments of the present invention; 

Figure 4 is a drawing demonstrating the use of the protein encoded by the URA3 
gene as a reporter protein in one embodiment of the present invention; 

Figure 5 shows an embodiment of the present invention in which a transcriptional 
activator is used as an active reporter which drives the expression of the selection marker 
gene URA3;

Figure 6 is a diagram illustrating an embodiment of the present invention in which 
a modifying enzyme is expressed in a multi-hybrid system and interaction between the 
modified proteins is detected; 

Figure 7 illustrates four different vector constructs that allow expression of 
different fusion proteins used in the intein-based two-hybrid systems demonstrated in the 
Example; 

Figure 8 shows some successful testing results of the intein-based two-hybrid 
systems demonstrated in the Example; 

Figure 9 illustrates the protein-protein interactions that give rise to functional 
Ura3p in the intein-based two-hybrid systems demonstrated in the Example. 

Detailed Description of the Invention 
As used herein, the terms "polypeptide," "protein," and "peptide" are used 
interchangeably to refer to amino acid chains in which the amino acid residues are linked 
by covalent peptide bonds. The amino acid chains can be of any length of at least two 
amino acids, including full-length proteins. Unless otherwise specified, the terms 
"polypeptide," "protein," and "peptide" also encompass various modified forms thereof, 
including but not limited to glycosylated forms, phosphorylated forms, etc. 

The term "test polypeptide" means a chemical compound, preferably an organic 
compound, to be tested in the present invention to determine its ability to interact with
another chemical compound. Test polypeptides may include various forms of organic
compounds, or combinations or conjugates thereof. In one embodiment, the test
polypeptides preferably are polypeptides, in which case the test polypeptides are termed
"test polypeptides" or "test proteins." 

The term "fusion protein" refers to a non-naturally occurring hybrid or chimeric 
protein having two or more distinct portions covalently linked together, each portion 
being or being derived from a specific molecule. 

As used herein, the term "interacting" or "interaction" means that two domains or 
independent entities exhibit sufficient physical affinity to each other so as to bring the 
two "interacting" domains or entities physically close to each other. An extreme case of 
interaction is the formation of a chemical bond that results in continual, stable proximity 
of the two domains. Interactions that are based solely on physical affinities, although 
usually more dynamic than chemically bonded interactions, can be equally effective at 
co-localizing independent entities. Examples of physical affinities and chemical bonds 
include but are not limited to, forces caused by electrical charge differences,
hydrophobicity, hydrogen bonds, van der Waals force, ionic force, covalent linkages, and
combinations thereof. The state of proximity between the interacting domains or entities 
may be transient or permanent, reversible or irreversible. In any event, it is in contrast to 
and distinguishable from contact caused by natural random movement of two entities. 
Typically although not necessarily, an "interaction" is exhibited by the binding between 
the interacting domains or entities. Examples of interactions include specific interactions 
between antigen and antibody, ligand and receptor, and the like. 

An "interaction" between two protein domains, fragments or complete proteins 
can be determined by a number of methods. For example, an interaction can be 
determined by functional assays such as the two-hybrid systems. Protein-protein
interactions can also be determined by various biophysical and biochemical approaches
based on the affinity binding between the two interacting partners. Such biochemical
methods generally known in the art include, but are not limited to, protein affinity
chromatography, affinity blotting, immunoprecipitation, and the like. The binding
constant for two interacting proteins, which reflects the strength or quality of the
interaction, can also be determined using methods known in the art. See Phizicky and
Fields, Microbiol. Rev., 59:94-123 (1995).

As used in the present disclosure, the term "reporter" means a molecule or a
moiety or domain thereof that can be used as a marker for the determination of the
occurrence of protein trans-splicing. An "inactive reporter" is a form of the reporter that
is not detectable by a particular detection means, while an "active reporter" is a form of
the reporter that is detectable by that detection means. It should be recognized that the
terms "detectable" and "not detectable" are used herein in a relative sense. In essence,
there should be a measurable or detectable change in the reporter, either quantitative or
qualitative, upon intein-based trans-splicing. For purposes of the present discussion,
"active reporters" include both reporters that are directly detectable and those reporters
that are detectable indirectly. One example of an indirectly detectable active reporter is a
transcription activator that can activate the transcription of a detectable gene and thus
cause the synthesis of a detectable protein encoded by the detectable gene.

Many reporters are known in the art and the selection and application of any of
those reporters to the present invention should be apparent to a skilled artisan apprised of
the present disclosure. Examples of reporters suitable for use in a yeast system or other
systems include, but are not limited to: β-galactosidase (β-Gal) encoded by the LacZ gene,
which converts white X-Gal into a product with a blue color; the product of the CYH2
gene, which confers sensitivity to cycloheximide (CYH); proteins encoded by the
auxotrophic genes URA3, HIS3, LEU2, and TRP1; and green fluorescent protein (GFP),
which can be sorted by fluorescence-activated cell sorting (FACS). See Cubitt et al., Trends
Biochem. Sci., 20:448-455 (1995).

Typically, an inactive reporter can be converted to an active reporter upon trans-
splicing in the method of this invention. For example, a molecule when fused to a
construct of the present invention may not be detectable and thus is referred to as "an
inactive reporter." The fused form may be released from the fusion protein into a free
form of the molecule that is detectable. This detectable free form is referred to as an
"active reporter," which is in contrast to the "inactive" undetectable bound form of the
reporter. In another example, two inactive reporters are fused to an N-intein and a C-
intein, respectively, and upon trans-splicing, the two inactive reporters are ligated
together forming a detectable active reporter. For this purpose, fragments of an active
reporter that are not detectable can also be referred to as "inactive reporters." Thus, an N-
terminal fragment of a reporter protein is fused to an N-intein and a C-terminal fragment
of the reporter protein is fused to a C-intein. Upon protein trans-splicing mediated by the
N- and C-intein, the N-terminal and C-terminal fragments can be ligated, thereby forming
a full-length detectable active reporter protein.

As is known in the art, inteins are intervening protein sequences in protein precursors
which are excised, or removed, from the protein precursors during protein splicing.
The protein sequences flanking inteins are called exteins. The excision of an intein is
associated with the concomitant ligation of the N-extein (the protein sequence to the N-
terminus of the intein) and the C-extein (the protein sequence to the C-terminus of the
intein) through a native peptide bond, thus forming a mature extein protein and a free
intein. See Perler et al., Nucleic Acids Res., 22:1125-1127 (1994). The entire protein
splicing process is autocatalyzed by the intein and is believed to be independent of
specific host cell factors. Indeed, intein-based protein splicing has been shown to occur
in vitro as well as in heterologous organisms. See Perler et al., Cell, 92:1-4 (1998).

Intein-based protein splicing has also been shown to be independent of the native
flanking exteins. Hybrid protein sequences containing inteins fused to non-native
polypeptide sequences are able to undergo protein splicing to excise the inteins and ligate
the flanking polypeptide sequences. See, e.g., Evans et al., J. Biol. Chem., 274:3923-3926
(1999); Evans et al., J. Biol. Chem., 275:9091-9094 (2000).

Certain amino acid sequences within an intein sequence are irrelevant to protein
splicing. Based on sequence comparison and structural analysis, it is now known that the
residues responsible for splicing are the intein N-terminal 100 amino acids,
approximately, and the intein C-terminal 50 amino acids, approximately. See, e.g., Duan
et al., Cell, 89:555-564 (1997); Hall et al., Cell, 91:85-97 (1997); Klabunde et al., Nature
Struct. Biol., 5:31-36 (1998). Indeed, a functional mini-intein can be produced by
deleting the centrally located irrelevant amino acid sequence leaving the N-terminal
sequence of about 100 amino acids fused directly to the C-terminal sequence of about 50
amino acids. See, e.g., Wu et al., Biochim. Biophys. Acta, 1387:422-32 (1998). In
addition, inteins have been identified that can mediate trans-splicing even when the N-
terminal intein sequence and the C-terminal intein sequence are in different proteins. See
id.; see also, Shingledecker et al., Gene, 207:187-195 (1998); Evans et al., J. Biol.
Chem., 274:3923-3926 (1999); Evans et al., J. Biol. Chem., 275:9091-9094 (2000).

The present invention utilizes the trans-splicing capability of inteins to provide a 
method for detecting protein-protein interactions in yeast. Thus, in accordance with the 
present invention, two fusion proteins are provided in a yeast cell: one has a first test 
polypeptide and an N-intein, and the other has a second test polypeptide and a C-intein. 
In addition, one or both fusion proteins have a reporter that undergoes detectable changes 
upon intein-mediated trans-splicing of the fusion proteins. If the first and second test 
polypeptides interact with each other and bring the N-intein and C-intein into close 
proximity to each other, protein trans-splicing takes place. As a result, the fusion 
proteins are trans-spliced and/or re-ligated causing detectable changes in the reporter. By 
detecting the changes in the reporter, the interaction between two test polypeptides can be
determined. 
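
The logic of the assay described above can be summarized in a small model. The following Python sketch is illustrative only and is not part of the disclosed system: the `FusionProtein` record, the `active_reporter_formed` function, the polypeptide labels, and the interaction table are all hypothetical names introduced here. It assumes an idealized, non-interacting N-intein/C-intein pair, so that trans-splicing occurs only when the test polypeptides interact.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class FusionProtein:
    """Hypothetical record describing one fusion protein in the assay."""
    test_polypeptide: str                    # label of the test polypeptide
    intein_half: str                         # "N" for N-intein, "C" for C-intein
    reporter_fragment: Optional[str] = None  # inactive reporter portion, if present


def active_reporter_formed(
    first: FusionProtein,
    second: FusionProtein,
    interacts: Callable[[str, str], bool],
) -> bool:
    """Return True when this idealized model predicts an active reporter."""
    # Trans-splicing needs one N-intein and one C-intein.
    if {first.intein_half, second.intein_half} != {"N", "C"}:
        return False
    # One or both fusion proteins must carry a reporter.
    if first.reporter_fragment is None and second.reporter_fragment is None:
        return False
    # Interaction of the test polypeptides brings the intein halves together.
    return interacts(first.test_polypeptide, second.test_polypeptide)


# Made-up interaction table: "X" and "Y" are placeholder labels, not real proteins.
KNOWN_PAIRS = {frozenset({"X", "Y"})}


def lookup(a: str, b: str) -> bool:
    """Hypothetical interaction oracle backed by the made-up table above."""
    return frozenset({a, b}) in KNOWN_PAIRS


bait = FusionProtein("X", "N", reporter_fragment="reporter-N-portion")
prey = FusionProtein("Y", "C", reporter_fragment="reporter-C-portion")
print(active_reporter_formed(bait, prey, lookup))
```

The model deliberately omits background splicing caused by any intrinsic affinity between the intein fragments, which is discussed separately in this description.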

As used herein, the terms "N-intein" and "C-intein" refer to an N-terminal and a 
C-terminal portion of an intein, respectively. An N-intein itself alone cannot direct 
protein splicing, and likewise, a C-intein itself alone is incapable of catalyzing protein 
splicing. However, when an N-intein and a C-intein are placed in close proximity, they 
are capable of acting in concert to catalyze protein trans-splicing. Conserved intein 
motifs have been identified in many inteins. Typically, an intein includes an N-terminal 
splicing region having sequence motifs designated A, N2, B, and N4, an endonuclease or 
linker domain region having sequence motifs designated C, D, E, and H, and a C-terminal 
splicing region having sequence motifs designated F and G. See Pietrokovski, Protein
Sci., 3:2340-2350 (1994); Pietrokovski, Protein Sci., 7:64-71 (1998). Thus, in a specific
embodiment, N-intein encompasses at least motifs A, N2, B, and N4, while C-intein
includes at least motifs F and G. Typically, "N-intein" is an amino acid sequence
matching the N-terminal sequence of about 90 to 110 amino acids of an intein, while "C-
intein" is an amino acid sequence matching the C-terminal sequence of about 30 to 50
amino acids of an intein. A skilled artisan will recognize that optimal sequences of N-
inteins and C-inteins can be determined by routine trial and error experiments. In
addition, it should be understood that the terms "N-intein" and "C-intein" also encompass
non-native or modified amino acid sequences that are derived from an N-terminal or C-
terminal portion of an intein, respectively, e.g., modified or mutein forms containing
amino acid insertions, deletions, or substitutions.
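
As a rough illustration of how N-intein and C-intein candidates might be derived from a full-length intein sequence, the following Python sketch takes the N-terminal and C-terminal portions of a one-letter amino acid string and, optionally, joins them into a mini-intein. The function names and the default lengths of 100 and 50 residues are assumptions chosen to match the approximate figures given above; the input used here is a synthetic placeholder, not a real intein sequence, and actual fragment boundaries would need to be determined experimentally.

```python
from typing import Tuple


def split_intein(intein_seq: str, n_len: int = 100, c_len: int = 50) -> Tuple[str, str]:
    """Return (N-intein, C-intein) candidates taken from the two ends of an intein.

    n_len and c_len are illustrative defaults (about 100 and about 50 residues).
    """
    if n_len <= 0 or c_len <= 0:
        raise ValueError("fragment lengths must be positive")
    if n_len + c_len > len(intein_seq):
        raise ValueError("intein is shorter than the two requested fragments")
    n_intein = intein_seq[:n_len]                     # N-terminal splicing region
    c_intein = intein_seq[len(intein_seq) - c_len:]   # C-terminal splicing region
    return n_intein, c_intein


def mini_intein(intein_seq: str, n_len: int = 100, c_len: int = 50) -> str:
    """Delete the central region and fuse the N- and C-terminal portions directly."""
    n_part, c_part = split_intein(intein_seq, n_len, c_len)
    return n_part + c_part


# Synthetic placeholder: 100 "A", a 300-residue central region of "G", 50 "C".
placeholder = "A" * 100 + "G" * 300 + "C" * 50
n_intein, c_intein = split_intein(placeholder)
print(len(n_intein), len(c_intein), len(mini_intein(placeholder)))
```

Varying `n_len` within about 90 to 110 and `c_len` within about 30 to 50 gives a simple way to enumerate candidate fragment pairs for the routine trial and error experiments mentioned above.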

Protein precursors containing inteins have been found in all three life domains: 
archaea, bacteria, and eucarya. A large number of inteins exist in bacteria and a few have
been found in yeast. See Perler et al., Nucleic Acids Res., 28:344-345 (2000); see also InBase,
the New England Intein Database, at http://www.neb.com/neb/inteins.html. The N-intein
and C-intein used in the fusion proteins of the present invention can be selected according 
to the naturally occurring intein sequences. Alternatively, the naturally occurring intein 
sequences can be modified by deleting, inserting, or substituting amino acids to generate 
desirable properties in the N- and C-intein. 

Some naturally occurring native N-inteins and C-inteins are known to interact 
with each other. This may cause undesirable background and could yield a high
frequency of false positives. To minimize the background and increase the assay
sensitivity in the present invention, it is preferred to use an N-intein and a C-intein that do
not substantially interact with each other. That is, they do not exhibit sufficient physical
affinity to each other or form chemical bonds between them so as to bring them
physically close to each other to cause substantial protein trans-splicing. Such non-
interaction will be operationally defined as an inability of an N-intein/C-intein pair to
yield an active reporter when fused to test polypeptides known to have no affinity for one 
another. 

If the N-intein and C-intein have relatively high affinity to each other, the N- 
intein and C-intein can be mutated to minimize their interaction. Alternatively, as will be 
described in detail below, competitive inhibitors of the reporters can be applied to 
minimize background detection signals. In this way, the detection signal from the active 
reporter produced by the interaction between the test proteins will be sufficiently greater 
than the background detection signal such that the interaction between the test proteins 
can be distinguished from the background interaction between the N-intein and C-intein. 
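
One simple way to express the comparison between the reporter signal and the background is a fold-over-background test. The Python sketch below is a minimal illustration under stated assumptions: the function name `exceeds_background`, the use of the mean of control readings as the baseline, and the default threshold of 3-fold are all hypothetical choices made here, not values taken from this disclosure.

```python
from statistics import mean
from typing import Sequence


def exceeds_background(
    signal: float,
    background: Sequence[float],
    min_fold: float = 3.0,  # assumed threshold, for illustration only
) -> bool:
    """Return True if the signal is at least min_fold times the mean background.

    `background` holds reporter readings from control cells in which the test
    polypeptides are known to have no affinity for one another.
    """
    if not background:
        raise ValueError("at least one background reading is required")
    baseline = mean(background)
    if baseline <= 0:
        raise ValueError("background readings must average above zero")
    return signal / baseline >= min_fold


controls = [2.0, 3.0, 4.0]  # made-up control readings in arbitrary units
print(exceeds_background(30.0, controls), exceeds_background(5.0, controls))
```

In practice the appropriate threshold would depend on the reporter and the detection method used.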

Various trans-splicing assays may be used in combination with recombinant 
mutagenesis techniques to generate an N-intein and a C-intein that do not interact with 
each other and yet are capable of catalyzing protein trans-splicing when brought into 
proximity to each other. Conveniently, a genetic selection assay can be employed. For 
example, as shown in Figure 2A, two chimeric genes can be prepared using standard 
recombinant DNA technologies. One chimeric gene encodes a fusion protein containing 
the N-terminal fragment of a reporter protein fused, at its C-terminus, to the N-terminus 
of an N-intein. The other chimeric gene encodes a fusion protein having a C-intein fused, 
at its C-terminus, to the N-terminus of the C-terminal fragment of a reporter protein. The 
N- and C-terminal fragments of the reporter protein should not interact with each other or 
with the N- or C-intein. They can be of any length so long as an active reporter protein can 
be generated when they are ligated together through protein trans-splicing mediated by 
the N- and C-intein. The genetic selection assay can be performed in any suitable host 
cells, preferably conducted in yeast cells. The two chimeric genes are introduced to a 
host cell for the expression of the two fusion proteins. Alternatively, in the case of yeast 
cells, they can be introduced into two yeast cells having different mating types, which are 
subsequently mated. If the N-intein and C-intein thus expressed interact with each other, 
an active reporter will be detectable in the host cell. To obtain N-inteins and C-inteins 
that do not interact with each other, the DNA coding regions for the N-intein and C-intein 
are mutated using standard mutagenesis techniques to create changes in the amino acid 
sequences of the N- and C-intein. The thus generated mutant chimeric genes are then 
introduced into host cells for the genetic selection assay described above. If the active 
reporter is cytotoxic or cytostatic, one can select for those yeast cells that express mutant 
N- and C-inteins that fail to interact spontaneously. Finally, both the N- and C-extein 
fusion proteins can be C-terminally tagged with an epitope to allow immunologic 
confirmation of expression of the non-interacting intein mutants. In this manner, random 
mutations can be caused in the N- and C-intein and those mutant N-inteins and C-inteins 
that do not interact with each other are selected. See Figure 2A. 

Besides random mutagenesis, site-directed mutagenesis can also be used to 
change amino acid sequences in wild-type N- and C-inteins in predetermined manners. 
For example, amino acid sequences can be modified to create consensus sequences for 
phosphorylation by protein kinases or for glycosylation. Alternatively, certain amino 
acids in wild-type N- and C-intein sequences can also be chemically modified, e.g., by 
incorporating non-natural amino acids or by chemically linking certain moieties to amino 
acid side chains. 

The selection of non-interacting N-intein and C-intein can also be done in an in 
vitro assay. For example, fusion proteins containing wild-type or mutated N- or C-inteins 
expressed from the above-described chimeric genes can be purified by standard 
chromatographic or affinity techniques or prepared in crude cell extracts. Fusion protein 
pairs (in which one contains an N-intein and the other contains a C-intein) are then mixed 
and incubated together in vitro under appropriate conditions to promote protein splicing 
as described below. 

The thus selected N- and C-inteins are further tested for their ability to catalyze 
protein trans-splicing in a yeast cell. For this purpose, the selected chimeric genes 
containing desirable N- and C-intein coding sequences are further modified. Figure 2B 
illustrates an example of this verification process. Essentially, a pair of new chimeric 
genes is constructed and introduced into a yeast cell for expressing a pair of fusion 
proteins. One chimeric gene encodes a fusion protein containing the above-described N- 
terminal fragment of a reporter protein fused, at its C-terminus, to the N-terminus of an 
N-intein, and a bait protein fused to the C-terminus of the N-intein. The other chimeric 
gene encodes a fusion protein having a C-intein fused, at its C-terminus, to the N- 
terminus of the above-described C-terminal fragment of a reporter protein, and a prey 
protein fused to the N-terminus of the C-intein. The bait protein and prey protein are 
known to interact with each other. Any pair of interacting proteins known in the art can 
be used for this purpose, such as the interacting pairs: FKBP12 and TGFβRI; FKBP12 
and FRAP; thyroid hormone receptor α and nuclear corepressor 1; Ras and Raf. See 
Huang and Schreiber, Proc Natl Acad Sci USA, 94:13396-401 (1997); Rossi et al., Proc 
Natl Acad Sci USA, 94:8405-10 (1997); Chen and Evans, Nature, 377:454-7 (1995); 
Pelletier et al., Proc Natl Acad Sci USA, 95:12141-6 (1998). After the new chimeric 
genes are expressed in a yeast cell to produce the fusion proteins, the active reporter is 
detected to determine whether trans-splicing has occurred. In this manner, N-inteins and 
C-inteins that do not interact with each other but are nevertheless capable of mediating 
protein trans-splicing in yeast cells when they are brought into proximity can be 
identified. 
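
By way of illustration only, the two-stage selection described above (absence of reporter signal when the inteins are expressed without interacting test proteins, followed by presence of reporter signal when they are fused to a known interacting bait and prey) can be sketched as a simple filter. The record type, field names, and example outcomes below are hypothetical and are not taken from the present disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InteinPairResult:
    """Hypothetical record of assay outcomes for one N-/C-intein pair."""
    name: str
    # Active reporter seen when no interacting test proteins are fused.
    reporter_without_test_proteins: bool
    # Active reporter seen when fused to a known interacting bait and prey.
    reporter_with_known_interactors: bool


def select_usable_pairs(results):
    """Keep pairs that are silent on their own but splice when brought together."""
    return [
        r.name
        for r in results
        if not r.reporter_without_test_proteins
        and r.reporter_with_known_interactors
    ]


# Illustrative, made-up outcomes; not experimental data.
example_results = [
    InteinPairResult("wild-type", True, True),
    InteinPairResult("mutant-1", False, True),
    InteinPairResult("mutant-2", False, False),
]
print(select_usable_pairs(example_results))
```

In this sketch a pair such as "mutant-2", which no longer interacts spontaneously but has also lost the ability to mediate trans-splicing, is discarded along with the spontaneously interacting wild-type pair.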

Thus, in accordance with the present invention, two fusion proteins can be provided 
in yeast cells, one having an N-intein and a first test polypeptide and the other having a 
C-intein and a second test polypeptide. At least one of the two fusion proteins has an 
inactive reporter capable of being converted to an active reporter upon trans-splicing 
mediated by the N-intein and the C-intein. The two fusion proteins are then mixed and 
incubated together or allowed to contact each other in other manners under 
appropriate conditions. Each of the two fusion proteins should be designed such that the 
interaction between the first and second test polypeptides can be determined by detecting 
or measuring the active reporter in the assay system. 

Optionally, a control assay is conducted in parallel to the detection assay. 
Typically, in the control assay, the potential interaction between the two test polypeptides 
being assayed in the detection assay of this invention is pre-empted, eliminated or 
inhibited. For example, in one control assay, control fusion proteins are used, in which 
two known polypeptides that do not interact with each other are included in lieu of the 
first and second test polypeptides, respectively. Because the known polypeptides in the 
control fusion proteins do not interact with each other, any active reporter signal in the 
control assay is a background signal. Alternatively, in another control assay, the control 
fusion proteins do not contain the first or second test polypeptides. In other words, the 
control fusion proteins are different from those in a detection assay in that the control 
fusion proteins do not contain test polypeptides. Thus, any active reporter signal in the 
control assay would not be the result of interaction between the test polypeptides. 

Preferably, a control assay utilizes the same two fusion proteins as those in a 
detection assay, which contain a first and a second test polypeptide, respectively. 
However, the control assay is conducted in the presence of an inhibitor that interferes 
with the interaction between the first and second test polypeptides in the fusion proteins. 
Typically, the inhibitor is an agent that interacts with one or both of the two test 
polypeptides in a manner such that the interaction between the two test polypeptides is 
disrupted, and as a result, the active reporter that would normally be formed upon 
interaction between the two test polypeptides is not produced. Conveniently, one of the 
two test polypeptides is used as an inhibitor. Such an agent should be in a free non- 
hybrid form or in a hybrid form that will not cause the formation of the active reporter 
upon an interaction between this hybrid form and the other test polypeptide in one of the 
two fusion proteins. For example, if the test polypeptide used as an inhibitor is a protein, 
it can be conveniently expressed from an expression vector containing a gene sequence 
encoding the protein. 

The level of detectable active reporter in the control assay is compared to that in 
the detection assay. As a result, positive signals indicating specific interactions in the 
detection assay can be confirmed and distinguished from background signals inherent in 
the assay system. A control assay is especially useful when the N-intein and C-intein 
used in the fusion proteins can interact with each other. 
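
The comparison between the detection assay and the control assay can be sketched as a signal-to-background calculation. This is a minimal sketch only: the function names, the use of a simple ratio, and the default cutoff of 3.0 are assumptions made for illustration and are not prescribed by the present disclosure.

```python
def signal_to_background(detection_signal, control_signal):
    """Ratio of reporter signal in the detection assay to that in the control assay."""
    if detection_signal < 0 or control_signal < 0:
        raise ValueError("signals must be non-negative")
    if control_signal == 0:
        # No background at all: any positive signal is infinitely above it.
        return float("inf") if detection_signal > 0 else 0.0
    return detection_signal / control_signal


def is_specific_interaction(detection_signal, control_signal, min_ratio=3.0):
    """Call an interaction specific when the signal exceeds background by min_ratio.

    min_ratio is a hypothetical cutoff; a suitable value would have to be
    determined empirically for a given reporter and intein pair.
    """
    return signal_to_background(detection_signal, control_signal) >= min_ratio
```

A pair of inteins with residual affinity for each other raises the control signal, so the same detection signal yields a lower ratio; this is why the control is most informative in that situation.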

A control assay can also be conducted simultaneously with the detection assay in the 
same host cell. In this case, the control fusion proteins described above (i.e., the third and 
fourth fusion proteins) should 
contain a second reporter different from that in the first and second fusion proteins such 
that the inability of the third and fourth fusion proteins to interact with each other can be 
demonstrated by detecting the presence or absence of an active form of the second 
reporter. 

Alternatively, measures can be taken to reduce background signals. For example, 
in the case when cells of a His⁻ (His-deficient) yeast strain are used as host cells and the 
HIS3 gene product (imidazole glycerol phosphate dehydratase) is used as a reporter, the 
compound 3-amino-1,2,4-triazole (3-AT) can be added to the medium on which the yeast 
cells in the assay are grown. 3-AT specifically inhibits the HIS3- 
encoded enzyme imidazole glycerol phosphate dehydratase, which is required in yeast for 
the synthesis of the amino acid histidine. See Kishore et al., Ann. Rev. Biochem., 57:627- 
663 (1988). As a result, a strong signal is required to confirm actual interaction between 
the test proteins. See Durfee et al., Genes Dev., 7:555-569 (1993). Selection for 
progressively stronger reporter signaling can be achieved with progressively higher 
concentrations of 3-AT in the selection medium. Thus, with sufficiently high 3-AT 
concentrations, background growth on histidine-deficient media can be suppressed to 
allow use of an inherently "noisy" system. 
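
The effect of increasing 3-AT concentration can be sketched with a toy model in which the reporter activity needed for growth rises with the amount of inhibitor in the medium. The linear relationship, the parameter names, and all numerical values below are assumptions made purely for illustration; they are not measurements and are not taken from the cited references.

```python
def grows_without_histidine(reporter_activity, three_at_mM,
                            basal_requirement=1.0, requirement_per_mM=1.0):
    """Toy model: growth requires activity above a 3-AT-dependent threshold.

    Both keyword parameters are hypothetical, in arbitrary activity units.
    """
    required = basal_requirement + requirement_per_mM * three_at_mM
    return reporter_activity >= required


def lowest_selective_concentration(background_activity, concentrations_mM):
    """Return the lowest tested 3-AT concentration that suppresses background growth."""
    for conc in sorted(concentrations_mM):
        if not grows_without_histidine(background_activity, conc):
            return conc
    return None  # none of the tested concentrations was sufficient


# Made-up example: background activity of 5 units, true interaction of 50 units.
selective = lowest_selective_concentration(5.0, [0, 2, 5, 10])
print(selective, grows_without_histidine(50.0, selective))
```

Under these assumed values, the background signal no longer supports growth at the selected concentration, whereas a much stronger signal from a genuine interaction still does.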

The detection assay in accordance with the present invention is preferably 
conducted in a yeast cell. In this respect, fusion proteins can be recombinantly expressed 
in a host cell by introducing into the host cell chimeric genes encoding the fusion 
proteins. For this purpose, the expression vectors and host cells used in various two- 
hybrid systems developed in the art may be adapted and incorporated in the assays. Such 
two-hybrid systems are generally disclosed in U.S. Patent Nos. 5,283,173; 5,525,490; 
5,585,245; 5,637,463; 5,695,941; 5,733,726; 5,776,689; 5,885,779; 5,905,025; 6,037,136; 
6,057,101; 6,114,111; and Bartel and Fields, eds., The Yeast Two-Hybrid System, Oxford 
University Press, New York, NY, 1997, all of which are incorporated herein by reference. 

Typically, two chimeric genes are prepared encoding two fusion proteins as 
described above containing an N-intein and a C-intein, respectively. For the purpose of 
convenience, the two test polypeptides whose interaction is to be determined are referred 
to as "bait polypeptide" and "prey polypeptide," respectively. The chimeric genes 
encoding the fusion proteins containing the bait and prey polypeptides are termed "bait 
chimeric gene" and "prey chimeric gene," respectively. Typically, a "bait vector" and a 
"prey vector" are provided for the expression of a bait chimeric gene and a prey chimeric 
gene, respectively. 

Many types of vectors can be used for the present invention. Methods for the 
construction of bait vectors and prey vectors should be apparent to skilled artisans in the 
art apprised of the present disclosure. See generally, Current Protocols in Molecular 
Biology, Vol. 2, Ed. Ausubel et al., Greene Publish. Assoc. & Wiley Interscience, Ch. 
13, 1988; Glover, DNA Cloning, Vol. II, IRL Press, Wash., D.C., Ch. 3, 1986; Bitter et 
al., in Methods in Enzymology 153:516-544 (1987); The Molecular Biology of the Yeast 
Saccharomyces, Eds. Strathern et al., Cold Spring Harbor Press, Vols. I and II, 1982; and 
Rothstein in DNA Cloning: A Practical Approach, Vol. II, Ed. DM Glover, IRL Press, 
Wash., D.C., 1986. 

Generally, the bait and prey vectors may include a promoter operably linked to a 
chimeric gene for the transcription of the chimeric gene, an origin of DNA replication for 
the replication of the vectors in yeast cells and a replication origin for the amplification of 
the vectors in, e.g., E. coli, and selection marker(s) for selecting and maintaining only 
those yeast cells harboring the vectors. Additionally, the vectors preferably also contain 
inducible elements, which function to control the expression of the chimeric gene. 
Making the expression of the chimeric genes inducible and controllable is especially 
important in the event that the fusion proteins or components thereof are toxic to the host 
yeast cells. Other regulatory sequences such as transcriptional enhancer sequences and 
translation regulation sequences (e.g., Shine-Dalgarno sequence) can also be included. 
Termination sequences such as the bovine growth hormone, SV40, lacZ and AcMNPV 
polyhedrin polyadenylation signals may also be operably linked to the chimeric gene. An 
epitope tag coding sequence for detection and/or purification of the fusion proteins can 
also be incorporated into the expression vectors. Examples of useful epitope tags 
include, but are not limited to, influenza virus hemagglutinin (HA), Simian Virus 5 (V5), 
polyhistidine (6xHis), c-myc, lacZ, GST, and the like. Proteins with polyhistidine tags 
can be easily detected and/or purified with Ni affinity columns, while specific antibodies 
to many epitope tags are generally commercially available. Bait and prey vectors may 
also contain components that direct the expressed protein extracellularly or to a particular 
intracellular compartment. Signal peptides, nuclear localization sequences, endoplasmic 
reticulum retention signals, mitochondrial localization sequences, myristoylation signals, 
palmitoylation signals, and transmembrane sequences are examples of optional vector 
components that can determine the destination of expressed proteins. The vectors can be 
introduced into host yeast cells by any techniques known in the art, e.g., by direct DNA 
transformation, microinjection, electroporation, viral infection, lipofection, gene gun, and 
the like. The bait and prey vectors can be maintained in yeast cells in an 
extrachromosomal state, i.e., as self-replicating plasmids or viruses. Alternatively, one or 
both vectors can be integrated into chromosomes of the host yeast cells by conventional 
techniques such as selection of stable cell lines or site-specific recombination. 

In accordance with the present invention, the fusion proteins are expressed in a 
yeast expression system using yeasts such as Saccharomyces cerevisiae, Hansenula 
polymorpha, Pichia pastoris, and Schizosaccharomyces pombe as host cells. The 
expression of recombinant proteins in yeasts is a well developed area, and the techniques 
useful in this respect are disclosed in detail in The Molecular Biology of the Yeast 
Saccharomyces, Eds. Strathern et al., Vols. I and II, Cold Spring Harbor Press, 1982; 
Ausubel et al., Current Protocols in Molecular Biology, New York, Wiley, 1994; and 
Guthrie and Fink, Guide to Yeast Genetics and Molecular Biology, in Methods in 
Enzymology, Vol. 194, 1991, all of which are incorporated herein by reference. Sudbery, 
Curr. Opin. Biotech., 7:517-524 (1996) reviews the success in the art in expressing 
recombinant proteins in various yeast species; the entire content and references cited 
therein are incorporated herein by reference. In addition, Bartel and Fields, eds., The 
Yeast Two-Hybrid System, Oxford University Press, New York, NY, 1997 contains 
extensive discussions of recombinant expression of fusion proteins in yeasts in 
connection with various yeast two-hybrid systems, and cites numerous relevant 
references. These and other methods known in the art can all be used for purposes of the 
present invention. The application of such methods to the present invention should be 
apparent to a skilled artisan apprised of the present disclosure. 

Generally, each of the two chimeric genes (one having an N-intein coding 
sequence and the other having a C-intein coding sequence) of the present invention is 
included in a separate expression vector (bait vector and prey vector). Both vectors can 
be co-transformed into a single yeast host cell. As will be apparent to a skilled artisan, it 
is also possible to express both chimeric genes from a single vector. In a preferred 
embodiment, the bait vector and prey vector are introduced into two haploid yeast cells of 
opposite mating types, e.g., a-type and α-type, respectively. The two haploid cells can be 
mated at a desired time to form a diploid cell expressing both chimeric genes. 

Generally, the bait and prey vectors for recombinant expression in yeasts include 
a yeast replication origin such as the 2μ origin or the ARSH4 sequence for the replication 
and maintenance of the vectors in yeast cells. Preferably, the vectors also have a bacterial 
origin of replication (e.g., ColE1) and a bacterial selection marker (e.g., ampᴿ marker, i.e., 
bla gene). Optionally, the CEN6 centromeric sequence is included to control the 
replication of the vectors in yeast cells. Any constitutive or inducible promoters capable 
of driving gene transcription in yeast cells may be employed to control the expression of 
the chimeric genes. Such promoters are operably linked to the chimeric genes. Examples 
of suitable constitutive promoters include but are not limited to the yeast ADH1, PGK1, 
TEF2, GPD1, HIS3, and CYC1 promoters. Examples of suitable inducible promoters 
include but are not limited to the yeast GAL1 (inducible by galactose), CUP1 (inducible 
by Cu²⁺), MEL1 (inducible by galactose), and FUS1 (inducible by pheromone) promoters; the 
AOX/MOX promoter from H. polymorpha and P. pastoris (repressed by glucose or 
ethanol and induced by methanol); chimeric promoters such as those that contain LexA 
operators (inducible by LexA-containing transcription factors); and the like. Inducible 
promoters are preferred when the fusion proteins encoded by the chimeric genes or the 
reporter proteins resulting from protein trans-splicing are toxic to the host cells. If it is 
desirable, certain transcription repressing sequences such as the upstream repressing 
sequence (URS) from the SPO13 promoter can be operably linked to the promoter sequence, 
e.g., linked to the 5' end of the promoter region. Such upstream repressing sequences 
function to fine-tune the expression level of the chimeric genes. 

Preferably, a transcriptional termination signal is operably linked to the chimeric 
genes in the vectors. Generally, transcriptional termination signal sequences derived 
from, e.g., the CYC1 and ADH1 genes can be used. 

Additionally, it is preferred that the bait vector and prey vector contain one or 
more selectable markers for the selection and maintenance of only those yeast cells that 
harbor the chimeric genes of the present invention. Any selectable markers known in the 
art can be used for purposes of this invention so long as yeast cells expressing the 
chimeric gene(s) of the present invention can be positively identified or negatively 
selected. Examples of markers that can be positively identified are those based on color 
assays, including the lacZ gene which encodes β-galactosidase, the firefly luciferase 
gene, secreted alkaline phosphatase, horseradish peroxidase, the blue fluorescent protein 
(BFP), and the green fluorescent protein (GFP) gene (see Cubitt et al., Trends Biochem. 
Sci., 20:448-455 (1995)). Other markers emitting fluorescence, chemiluminescence, UV 
absorption, infrared radiation, and the like can also be used. Among the markers that can 
be selected are auxotrophic markers that include, but are not limited to, URA3, HIS3, 
TRP1, LEU2, LYS2, ADE2, and the like. Typically, for purposes of auxotrophic selection, 
the yeast host cells transformed with bait vector and/or prey vector are cultured in a 
medium lacking a particular nutrient. Other selectable markers are not based on 
auxotrophies, but rather on resistance or sensitivity to an antibiotic or other xenobiotics. 
Examples include, but are not limited to, chloramphenicol acetyl transferase (CAT) gene, 
which confers resistance to chloramphenicol; CAN1 gene, which encodes an arginine 
permease and thereby renders cells sensitive to canavanine (see Sikorski et al., Meth. 
Enzymol., 194:302-318 (1991)); the bacterial kanamycin resistance gene (kanᴿ), which 
renders eucaryotic cells resistant to the aminoglycoside G418 (see Wach et al., Yeast, 
10:1793-1808 (1994)); and CYH2 gene, which confers sensitivity to cycloheximide (see 
Sikorski et al., Meth. Enzymol., 194:302-318 (1991)). In addition, the CUP1 gene, which 
encodes metallothionein and thereby confers resistance to copper, is also a suitable 
selection marker. Each of the above selection markers may be used alone or in 
combination. One or more selection markers can be included in a particular bait or prey 
vector. The bait vector and prey vector may have the same or different selection markers. 
In addition, the selection pressure can be placed on the transformed host cells either 
before or after mating the haploid yeast cells. 

As will be apparent, the selection markers used should complement the host 
strains in which the bait and/or prey vectors are expressed. In other words, when a gene 
is used as a selection marker gene, a yeast strain lacking the selection marker gene (or 
having a mutation in the corresponding gene) should be used as host cells. Numerous yeast 
strains or derivative strains corresponding to various selection markers are known in the 
art. Many of them have been developed specifically for certain yeast two-hybrid 
systems. The application and optional modification of such strains with respect to the 
present invention should be apparent to a skilled artisan apprised of the present 
disclosure. Methods for genetically manipulating yeast strains using genetic crossing or 
recombinant mutagenesis are well known in the art. See, e.g., Rothstein, Meth. Enzymol., 
101:202-211 (1983). By way of example, the following yeast strains are well known in 
the art, and can be used in the present invention upon necessary modifications and 
adjustment: 

L40 strain which has the genotype MATa his3Δ200 trp1-901 leu2-3,112 ade2 
LYS2::(lexAop)4-HIS3 URA3::(lexAop)8-lacZ; 

EGY48 strain which has the genotype MATα trp1 his3 ura3 6ops-LEU2; and 

MaV103 strain which has the genotype MATa ura3-52 leu2-3,112 trp1-901 
his3Δ200 ade2-101 gal4Δ gal80Δ SPAL10::URA3 GAL1::HIS3::lys2 (see Kumar et al., 
J. Biol. Chem. 272:13548-13554 (1997); Vidal et al., Proc. Natl. Acad. Sci. USA, 
93:10315-10320 (1996)). Such strains are generally available in the research community, 
and can also be obtained by simple yeast genetic manipulation. See, e.g., The Yeast Two- 
Hybrid System, Bartel and Fields, eds., pages 173-182, Oxford University Press, New 
York, NY, 1997. 

In addition, the following yeast strains are commercially available: 

Y190 strain which is available from Clontech, Palo Alto, California and has the 
genotype MATa gal4 gal80 his3Δ200 trp1-901 ade2-101 ura3-52 leu2-3,112 
URA3::GAL1-lacZ LYS2::GAL1-HIS3 cyhʳ; and 

YRG-2 strain which is available from Stratagene, La Jolla, California and has the 
genotype MATa ura3-52 his3-200 ade2-101 lys2-801 trp1-901 leu2-3,112 gal4-542 
gal80-538 LYS2::GAL1-HIS3 URA3::GAL1/CYC1-lacZ. 

In fact, different versions of vectors and host strains specially designed for yeast 
two-hybrid system analysis are available in kits from commercial vendors such as 
Clontech, Palo Alto, California and Stratagene, La Jolla, California, all of which can be 
modified for use in the present invention. 
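
The requirement that an auxotrophic selection marker complement a deficiency of the host strain can be sketched as a simple compatibility check. The function names are hypothetical, and the host is represented only by a simplified set of deficient genes rather than a full genotype; this is an illustration under those assumptions, not a substitute for examining the actual strain.

```python
# Nutrient omitted from the medium to select for each auxotrophic marker.
MARKER_NUTRIENT = {
    "URA3": "uracil",
    "HIS3": "histidine",
    "TRP1": "tryptophan",
    "LEU2": "leucine",
    "LYS2": "lysine",
    "ADE2": "adenine",
}


def check_markers(host_deficient_genes, vector_markers):
    """Split vector markers into those the host lacks (usable) and the rest."""
    host = {gene.upper() for gene in host_deficient_genes}
    usable, unusable = [], []
    for marker in vector_markers:
        name = marker.upper()
        if name not in MARKER_NUTRIENT:
            raise ValueError(f"not an auxotrophic marker in this sketch: {marker}")
        (usable if name in host else unusable).append(name)
    return usable, unusable


def dropout_nutrients(host_deficient_genes, vector_markers):
    """Nutrients to omit from the medium so that only vector-bearing cells grow."""
    usable, unusable = check_markers(host_deficient_genes, vector_markers)
    if unusable:
        raise ValueError(f"host is not deficient for: {unusable}")
    return sorted(MARKER_NUTRIENT[name] for name in usable)


# Simplified, assumed host deficiencies (his3, trp1, leu2, ade2).
host = {"his3", "trp1", "leu2", "ade2"}
print(dropout_nutrients(host, ["TRP1", "LEU2"]))
```

A marker such as URA3 would be reported as unusable for this assumed host, because a host that is not deficient in the corresponding gene grows on the dropout medium whether or not it carries the vector.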

As described above, each of the two fusion proteins should be designed such that 
the interaction between the first and second test polypeptides is determinable by detecting 
or measuring changes in the reporter in the assay system. As will be apparent from the 
above discussion, the reporter can be any molecule or moiety so long as changes in the 
reporter that are specifically associated with intein-mediated trans-splicing are detectable. 
It will be recognized that although the reporters and selection markers can be of similar 
types and used in a similar manner in the present invention, the reporters and selection 
markers should be carefully selected in a particular detection assay such that they are 
distinguishable from each other and do not interfere with each other's roles. 

Conveniently, the occurrence of trans-splicing can be detected by detecting 
changes in the size of the reporter. For example, the sizes of the various components of 
the fusion proteins can be designed such that the "active reporter," which is generated 
when the "inactive reporter" is simply cleaved off from one of the fusion proteins or 
recombined with one or more other components of the fusion proteins, is distinguishable 
from its precursor(s) and other trans-splicing products based on size, i.e., molecular 
weight. The inactive reporter can be pre-labeled with, e.g., radioactive isotope or 
fluorescence or other detectable markers, and the active reporter can be detected in, e.g., 
gel electrophoresis either before or after purification. Purification can be based on 
specific affinity columns using an antigen-specific protein, e.g., light-chain 
immunoglobulin, heavy-chain immunoglobulin, avidin, streptavidin, protein A, and 
antigenic peptides. Conveniently, the commonly used and commercially available 
epitope tags may be used as size-based reporters. Such epitope tags include sequences 
derived from, e.g., influenza virus hemagglutinin (HA), Simian Virus 5 (V5), 
polyhistidine (6xHis), c-myc, lacZ, GST, and the like. For example, proteins with 
polyhistidine tags can be easily detected and/or purified with Ni affinity columns. One 
advantage of using such epitope tags is that specific antibodies to many of these epitope 
tags are generally commercially available. Alternatively, an antibody 
specific to the "active reporter" can be used to detect the level of the active reporter 
generated in the assay without purification. 
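
The size-based design consideration can be sketched by computing the expected lengths of the species present after trans-splicing, assuming the fusion architecture described earlier (reporter N-fragment, N-intein, bait in one fusion protein; prey, C-intein, reporter C-fragment in the other). Lengths are in amino acid residues; the function names, the example lengths, and the resolution cutoff are hypothetical.

```python
def expected_sizes(n_reporter, n_intein, bait, prey, c_intein, c_reporter):
    """Expected lengths (in residues) of precursors and trans-splicing products."""
    return {
        "n_precursor": n_reporter + n_intein + bait,
        "c_precursor": prey + c_intein + c_reporter,
        "spliced_reporter": n_reporter + c_reporter,
        "excised_n_intein_bait": n_intein + bait,
        "excised_prey_c_intein": prey + c_intein,
    }


def reporter_is_resolvable(sizes, min_difference):
    """True if every other species differs from the spliced reporter by at least
    min_difference residues (an assumed proxy for separability on a gel)."""
    target = sizes["spliced_reporter"]
    return all(
        abs(length - target) >= min_difference
        for name, length in sizes.items()
        if name != "spliced_reporter"
    )


# Made-up component lengths, for illustration only.
sizes = expected_sizes(n_reporter=100, n_intein=120, bait=300,
                       prey=250, c_intein=40, c_reporter=150)
print(sizes["spliced_reporter"], reporter_is_resolvable(sizes, 30))
```

If a design fails this check, the lengths of the reporter fragments or of an added epitope tag could be adjusted until the spliced reporter is distinguishable from the other species.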

In another embodiment, the fusion proteins are designed such that the active 
reporter produced during intein-mediated trans-splicing can be detected by a color-based 
assay. For example, when an N-terminal portion of the lacZ protein (β-galactosidase) is 
fused to the N-terminus of an N-intein in a fusion protein and a C-terminal portion of the 
lacZ protein is fused to the C-terminus of a C-intein in another fusion protein, protein 
trans-splicing will religate the N- and C-terminal portions of the lacZ protein to form a 
full-length, complete, and active lacZ protein. Thus, in the presence of a substrate for β- 
galactosidase (e.g., X-Gal, i.e., 5-bromo-4-chloro-3-indolyl-β-D-galactoside), the trans- 
splicing can be detected based on appearance of a blue color or by quantitative 
colorimetric assay. To produce the chimeric genes in this embodiment of the invention, 
the lacZ gene encoding β-galactosidase can be divided into a 5' portion and a 3' portion 
in any manner to encode an N-terminal portion and a C-terminal portion of the β- 
galactosidase. As discussed above, protein splicing may be facilitated if 
the first amino acid immediately following the C-intein is cysteine, serine, or threonine. 
Thus, if at all possible, the division of the lacZ gene is made immediately before a genetic 
codon for cysteine, serine, or threonine such that the first amino acid in the C-terminal 
portion of β-galactosidase immediately following a C-intein in a fusion protein is one of 
the three preferred amino acids. Certain mutations may also be introduced into the lacZ 
gene to substitute a cysteine, serine or threonine for another amino acid, or for any other 
purposes, so long as the mutation does not adversely interfere with protein trans-splicing 
or the detection of the active reporter protein, i.e., β-galactosidase. 
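
The preference for dividing the reporter immediately before a cysteine, serine, or threonine can be sketched as a scan over the reporter's amino acid sequence. The function name, the `min_fragment_len` parameter, and the short test sequence are hypothetical; the sketch operates on a one-letter protein sequence and does not address whether the resulting fragments reconstitute an active reporter.

```python
PREFERRED_FIRST_RESIDUES = {"C", "S", "T"}  # cysteine, serine, threonine


def candidate_split_sites(protein_seq, min_fragment_len=1):
    """Return 0-based indices i such that splitting into seq[:i] and seq[i:]
    leaves a C-terminal fragment starting with Cys, Ser, or Thr.

    min_fragment_len is the minimum length required of both fragments.
    """
    seq = protein_seq.upper()
    return [
        i
        for i in range(min_fragment_len, len(seq) - min_fragment_len + 1)
        if seq[i] in PREFERRED_FIRST_RESIDUES
    ]


# Toy sequence, not a real reporter.
toy = "MASKTLC"
for site in candidate_split_sites(toy, min_fragment_len=2):
    print(toy[:site], toy[site:])
```

Where no acceptable site exists in a desired region, the text above contemplates introducing a substitution to create one, provided reporter activity and trans-splicing are not impaired.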

As will be apparent, many other reporters can be used in a similar manner in the 
present invention. Such other reporters include, for example, the green fluorescent 
protein (GFP), which can be detected by fluorescence assay and sorted by fluorescence-activated 
cell sorting (FACS) (see Cubitt et al., Trends Biochem. Sci., 20:448-455 (1995)), secreted 
alkaline phosphatase, horseradish peroxidase, the blue fluorescent protein (BFP), and 
luciferase photoproteins such as aequorin, obelin, mnemiopsin, and berovin (see U.S. 
Patent No. 6,087,476, which is incorporated herein by reference). 

In another embodiment, an auxotrophic factor is used as a reporter in an in vivo 
assay in a host strain deficient in the auxotrophic factor. Thus, suitable auxotrophic 
reporter genes include, but are not limited to, URA3, HIS3, TRP1, LEU2, LYS2, ADE2, 
and the like. For example, yeast cells containing a mutant URA3 gene can be used as 
host cells (Ura⁻ phenotype) for the in vivo assay as illustrated in Figure 4. Such cells lack 
URA3-encoded functional orotidine-5'-phosphate decarboxylase, an enzyme required by 
yeast cells for the biosynthesis of uracil. As a result, the cells are unable to grow on a 
medium lacking uracil. However, wild-type orotidine-5'-phosphate decarboxylase 
catalyzes the conversion of a non-toxic compound 5-fluoroorotic acid (5-FOA) to a toxic 
product, 5-fluorouracil. Thus, yeast cells containing a wild-type URA3 gene are sensitive 
to 5-FOA and cannot grow on a medium containing 5-FOA. Therefore, when an N- 
terminal portion of the URA3-encoded protein (orotidine-5'-phosphate decarboxylase) is 
fused to the N-terminus of an N-intein in a fusion protein and a C-terminal portion of the 
URA3-encoded protein is fused to the C-terminus of a C-intein in another fusion protein, 
protein trans-splicing initiated by interaction between the test proteins in the fusion 
proteins will result in ligation of the N- and C-terminal portions of the URA3-encoded 
protein, thereby forming a full-length, complete, and active orotidine-5'-phosphate 
decarboxylase. This enables the Ura⁻ Foaᴿ yeast cells to grow on a uracil deficient 
medium (SC-Ura plates). However, such cells will not survive on a medium containing 
5-FOA. Therefore, protein trans-splicing events and interactions between test proteins 
can be detected based on cell growth. 
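
The growth-based readout for a URA3 reporter in a Ura⁻ host can be summarized as a small truth table. This is a sketch only: the function name is hypothetical, and it assumes that the 5-FOA medium is supplemented with uracil so that cells lacking the active reporter are able to grow on it.

```python
def predicted_growth(active_ura3_reporter):
    """Predict growth of a Ura- host on the two selective media.

    active_ura3_reporter: True if trans-splicing has reconstituted active
    orotidine-5'-phosphate decarboxylase in the cell.
    """
    return {
        # Uracil prototrophy requires the reconstituted enzyme.
        "SC-Ura": active_ura3_reporter,
        # The same enzyme converts 5-FOA into a toxic product.
        "5-FOA": not active_ura3_reporter,
    }


for interacting in (True, False):
    print(interacting, predicted_growth(interacting))
```

The two media therefore give opposite readouts for the same event, which allows either positive selection (growth on SC-Ura) or counter-selection (growth on 5-FOA).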

Additionally, antibiotic resistance reporters can also be employed in a similar
manner. In this respect, host cells sensitive to a particular antibiotic are used. Antibiotic
resistance reporters include, for example, the chloramphenicol acetyl transferase (CAT) gene
and the kanR gene, which confers resistance to G418 in eukaryotes and to kanamycin in
prokaryotes.

In yet another embodiment of the present invention, the fusion proteins are
designed such that intein-mediated trans-splicing produces an active reporter that is a
transcriptional activator or repressor capable of activating or repressing the expression of
a detectable gene. Thus, the trans-splicing event will be detected based on the expression
or suppression of the detectable gene. In this embodiment, a "reporting vector"
containing the detectable gene operably linked to a transcriptional regulatory sequence is
also introduced into the host cells. The above-described selection markers and reporter
genes can all be used as the detectable gene for this purpose, so long as activation or
suppression of the expression of the detectable gene is readily detectable. For example,
as illustrated in Figure 5, the URA3 gene can be used as a detectable gene in connection
with either a transcriptional activator or suppressor. (An activator is shown in Figure 5.)
The URA3 gene is operably linked to a transcriptional regulatory sequence responsive to
the transcriptional activator or suppressor. When the active reporter generated in
trans-splicing is an activator, the yeast host cells (Ura-) grow on a uracil-deficient (SC-Ura)
medium and the interaction between the test proteins is detected based on yeast colony
formation on the medium. Alternatively, when the active reporter generated in
trans-splicing is a suppressor, the yeast host cells (Ura-) grow on a medium containing
5-fluoroorotic acid (5-FOA). In the absence of an interaction between the test proteins, the
URA3 gene is expressed, and the 5-FOA is converted by the URA3 gene product into a
toxic substance, which inhibits the growth of the host cells. In the presence of an
interaction between the test proteins, a suppressor is generated and the URA3 gene
expression is shut off. As a result, yeast colonies can be formed on a medium containing
5-FOA. The transcriptional regulatory sequence is designed such that the detectable gene
is specifically responsive to the active reporter. Alternatively, a suitable detectable gene
integrated in a chromosome of a host cell can also be used.
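
The choice of selective medium for the activator and suppressor variants can be sketched
as follows. This Python fragment is a simplified illustration rather than part of the
disclosed system: the function names are hypothetical, and it assumes that URA3 on the
reporting vector is silent unless the activator is present, fully shut off when the
suppressor is present, and that the 5-FOA medium contains uracil.

```python
def ura3_expressed(reporter_kind: str, interaction: bool) -> bool:
    """Return whether the detectable URA3 gene is expressed (simplified on/off model)."""
    if reporter_kind == "activator":
        return interaction        # activator formed by trans-splicing switches URA3 on
    if reporter_kind == "suppressor":
        return not interaction    # suppressor formed by trans-splicing switches URA3 off
    raise ValueError(f"unknown reporter kind: {reporter_kind}")


def colony_forms(reporter_kind: str, interaction: bool, medium: str) -> bool:
    """Predict colony formation on 'SC-Ura' or '5-FOA' medium."""
    expressed = ura3_expressed(reporter_kind, interaction)
    if medium == "SC-Ura":
        return expressed          # URA3 product is needed to make uracil
    if medium == "5-FOA":
        return not expressed      # URA3 product makes 5-FOA toxic
    raise ValueError(f"unknown medium: {medium}")


def selection_medium(reporter_kind: str) -> str:
    """Medium on which colony formation indicates an interaction."""
    return {"activator": "SC-Ura", "suppressor": "5-FOA"}[reporter_kind]


if __name__ == "__main__":
    for kind in ("activator", "suppressor"):
        medium = selection_medium(kind)
        for interaction in (True, False):
            print(kind, medium, interaction, colony_forms(kind, interaction, medium))
```

In both variants, colonies on the chosen medium appear only when the test proteins
interact, which is the property the selection relies on.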

Suitable transcription activators include, but are not limited to, GAL4, GCN4,
ARD1, the human estrogen receptor, E. coli LexA protein, herpes simplex virus VP16
(Triezenberg et al., Genes Dev. 2:718-729 (1988)), the E. coli B42 protein (acid blob, see
Gyuris et al., Cell 75:791-803 (1993)), NF-κB p65, and the like. In addition, hybrid
transcriptional activators composed of a DNA binding domain from one transcriptional
activator and an activation domain from another transcriptional activator are also useful.
Examples of transcription suppressors include the Krüppel protein, the engrailed protein,
the knirps protein, the paired protein and the even-skipped protein, all from Drosophila;
the SIN3, GAL80, and TUP1 proteins, all from Saccharomyces cerevisiae; the tet
repressor; the Egr-1, WT1, RARα, KRAB, verbA, YY1, ADE1B, E4B4, SCIP, kid-1,
Znf2, and kox-1 proteins; and the like. The corresponding transcriptional elements
specifically interacting with the transcriptional activators or repressors are well known in
the art. See, e.g., Hanna-Rose and Hansen, Trends Genet., 12:229-234 (1996).

Thus, a transcriptional activator or repressor protein can be divided into an
N-terminal portion and a C-terminal portion which are fused to the N-terminus of N-intein
and C-terminus of C-intein, respectively. Upon protein trans-splicing, a full-length
protein emerges as a functional transcriptional activator or repressor which subsequently
activates or represses the expression of the detectable gene in the reporting vector. See
Figure 5. It is recognized that the interaction between the test proteins may bring the two
portions of the transcriptional activator or suppressor together which may be sufficient to
initiate or suppress the transcription of the detectable gene. In this respect, this specific
embodiment of the present invention may be similar to the classic yeast two-hybrid
system. However, unlike the classic transcription-based yeast two-hybrid system, it is
possible in the present invention to produce an active transcriptional activator or
suppressor that is authentic. Thus, the fusion proteins need not be transported into the cell
nucleus, since the transcriptional activator or suppressor, once formed during protein
trans-splicing, is competent for translocation to the nucleus. Indeed, the method of the
present invention enables use of mitochondrial transcription factors as reporters. Once
formed by protein trans-splicing, such reporters can translocate to the mitochondria,
where they can activate or suppress transcription of mitochondrially encoded, detectable
genes.

The method of the present invention for detecting protein-protein interactions can
also be used to screen an expression library or applied in the so-called "interaction
mating." Methods for constructing activation domain or DNA binding domain fusion
libraries and the use thereof in yeast two-hybrid system are well known in the art and are
disclosed in, e.g., Vojtek et al., in The Yeast Two-Hybrid System, Bartel and Fields, eds.,
pages 29-42, Oxford University Press, New York, NY, 1997; Zhu et al., in The Yeast
Two-Hybrid System, Bartel and Fields, eds., pages 73-96, Oxford University Press, New
York, NY, 1997. Interaction mating is disclosed in U.S. Patent Nos. 6,057,101 and
6,083,693; and Finley and Brent, in The Yeast Two-Hybrid System, Bartel and Fields,
eds., pages 197-214, Oxford University Press, New York, NY, 1997. The methods
described in the above references can all be applied to the present invention upon
appropriate modifications. By way of example, N-intein fusion libraries can be prepared
using an expression vector containing a 5' portion of a reporter gene operably linked to
the 5' end of N-intein coding sequence. Operably linked to the 3' end of the N-intein
coding sequence is a multiple cloning site into which various random or predetermined
(e.g., cDNAs) DNA sequences can be inserted in frame. The DNA library thus prepared
can be transformed into appropriate yeast cells. In this yeast library, an array of fusion
proteins can be expressed, with each fusion protein containing an N-terminal portion of
the reporter protein fused to the N-terminus of the N-intein and a random or
predetermined polypeptide fused to the C-terminus of the N-intein. Appropriate yeast
cells expressing a fusion protein including a bait protein fused to the N-terminus of a
C-intein and the C-terminal portion of the reporter protein fused to the C-terminus of the
C-intein can be used to screen the yeast N-intein fusion library to identify prey proteins
capable of interacting with the bait protein.

C-intein fusion libraries can also be established and used in "interaction mating"
with the N-intein fusion libraries. In this way, interacting protein pairs can be identified
and genes encoding such proteins are isolated.
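
Conceptually, interaction mating tests every member of one library against every member
of the other. The Python sketch below illustrates that all-against-all matrix only; the
function name, the library contents (including the placeholder "ProteinA"), and the
interacts predicate are hypothetical stand-ins for the growth-based readout obtained from
the mated yeast.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple


def interaction_mating(
    n_library: Iterable[str],
    c_library: Iterable[str],
    interacts: Callable[[str, str], bool],
) -> List[Tuple[str, str]]:
    """Return (N-intein fusion, C-intein fusion) pairs expected to score positive."""
    return [
        (n_member, c_member)
        for n_member, c_member in product(n_library, c_library)
        if interacts(n_member, c_member)
    ]


if __name__ == "__main__":
    # Hypothetical ground truth used in place of an experimental readout.
    known_pairs = {frozenset({"BclX", "Bad"})}

    def interacts(a: str, b: str) -> bool:
        return frozenset({a, b}) in known_pairs

    library = ["BclX", "Bad", "ProteinA"]
    print(interaction_mating(library, library, interacts))
```

In an actual screen, the predicate would be replaced by observed growth of each diploid
on the selective medium.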

In yet another embodiment of the detection method of the present invention, the
detection assay is used to detect interactions between three or more agents in a trimeric or
higher order complex. See U.S. Patent No. 5,695,941; Chang et al., Cell 79:131-141
(1994); Tirode et al., J. Biol. Chem., 272:22995-22999 (1997); Van Criekinge et al.,
Anal. Biochem., 263:62-66 (1998); and Pause et al., Proc. Natl. Acad. Sci. USA, 96:9533-
9538 (1999), all of which are incorporated herein by reference. Essentially, the above-
described detection assay of this invention involving two fusion proteins is conducted in
the presence of one or more other test polypeptides. In this manner, interactions between
the two test polypeptides in the fusion proteins that require the participation of the other
test polypeptides can be detected.

The other test polypeptides can be small molecule ligands that interact with the
test polypeptides in the fusion proteins. Many protein-protein interactions require the
presence of a small molecule ligand, which becomes an integral part of the assembly
formed by the protein interactions. See Berlin, in The Yeast Two-Hybrid System, Bartel
and Fields, eds., pages 259-272, Oxford University Press, New York, NY, 1997. For
example, immune suppressants such as cyclosporin A (CsA), FK506, and rapamycin are
known to bind with high affinity to immunophilins forming protein-drug complexes
which, in turn, bind to specific target proteins to inhibit their activities. The classic yeast
two-hybrid system has been employed successfully to isolate proteins interacting with the
FKBP12/rapamycin complex. See, e.g., Chiu et al., Proc. Natl. Acad. Sci. USA,
91:12574-12578 (1994). A multi-hybrid assay in accordance with the present invention
can be conducted both in vitro and in vivo. In an in vitro assay, the small molecule
ligands are simply added to the above-described intein-based two-hybrid assay system of
the present invention. In an in vivo assay it is necessary that the small molecule ligands
are taken up by the host cells. While many host cells are able to take up various small
molecule ligands, certain host cells can also be manipulated to increase the uptake of
small molecule ligands. For example, yeast high uptake mutants such as erg6 mutant
strains can facilitate the uptake of the test compounds by yeast cells. See Gaber et al.,
Mol. Cell. Biol., 9:3447-3456 (1989).

Many protein interactions require the participation of other proteins. Thus, the
other test polypeptides in the multi-hybrid assay of the present invention can also be
proteins. Accordingly, genes encoding test proteins other than those in the intein-
containing fusion proteins can be co-expressed in host cells with the chimeric genes as
described above. Such additional genes may be incorporated into one of the bait or prey
vectors or the reporting vector. Alternatively, they can be expressed in separate vectors
under control of a constitutive or inducible promoter.

In a specific embodiment, the additional test proteins are enzymes capable of
post-translationally modifying at least one of the test polypeptides in the intein-
containing fusion proteins of the present invention. See Figure 6. This is especially
useful when one or both of the test proteins in the intein-containing fusion proteins are
believed to contain consensus sequences for certain modifying enzymes. A two-hybrid
system involving modifying enzymes has been disclosed in, e.g., U.S. Patent No.
5,637,463, which is incorporated herein by reference. This system can be applied to the
present invention upon appropriate modifications as will be apparent to a skilled artisan
apprised of the present disclosure. Examples of useful modifying enzymes include
protein kinases which catalyze protein phosphorylation (e.g., serine/threonine
phosphorylation, tyrosine phosphorylation by tyrosine kinase, see Lioubin et al., Genes
Dev., 10:1084-1095 (1996); Keegan et al., Oncogene, 12:1537-1544 (1996)), and enzymes
catalyzing fatty acid acylation, ADP-ribosylation, myristylation, and glycosylation. The
modifying enzymes can be co-expressed in the host cells with the intein-containing fusion
proteins. It is recognized that over-expression of certain modifying enzymes such as
tyrosine kinases may be toxic to host cells. This can be avoided by using inducible
promoters or weak promoters to drive expression of the toxic modifying enzymes in host
cells.

In yet another aspect of the present invention, a kit is provided comprising various
vectors and reagents described above. The kit will provide users some convenience in
practicing the various embodiments of the present invention. In particular, the kit can be
used in detecting and/or characterizing protein-protein interactions. Accordingly,
components that can be included in the kit will be apparent to a skilled artisan apprised of
the present disclosure. Specifically, any vectors, reagents, and the like described above
in connection with various embodiments of the present invention can be included in the
kit. Typically, the various components of the kit are placed in a rack, compartmentalized
support or enclosed container for purposes of organizing and/or transporting the kit.

In a specific embodiment, the kit includes at least a pair of expression vectors.
One expression vector contains a chimeric gene operably linked to a transcription
regulatory sequence. The chimeric gene includes a DNA sequence encoding an N-intein
and a multiple cloning site (MCS). The multiple cloning site is operably linked to the
N-intein coding sequence such that a DNA sequence encoding a test polypeptide of interest
can be conveniently inserted in frame into the MCS and a fusion protein can be produced
containing the N-intein and the test polypeptide. Likewise, the other expression vector
also contains a transcription regulatory sequence operably linked to a chimeric gene
which includes a DNA sequence encoding a C-intein and a multiple cloning site (MCS).
The multiple cloning site is operably linked to the C-intein coding sequence such that a
DNA sequence encoding another test polypeptide of interest can be conveniently inserted
in frame into the MCS and a fusion protein can be produced containing the C-intein and
the test polypeptide. One or both of the chimeric genes further contain an operably
linked DNA sequence encoding an inactive reporter protein capable of being converted to
an active reporter protein upon trans-splicing mediated by the N-intein and the C-intein.
Various arrangements of the chimeric genes can be used, as will be apparent from the
discussions below in connection with the method for detecting protein-protein
interactions of the present invention. In a preferred embodiment, specially selected
and/or modified coding sequences for the N-intein and C-intein are used such that the
N-intein and C-intein do not significantly interact with one another.

The expression vectors may also include other components as described above in 
connection with the bait vectors and prey vectors of the present invention. For example, 
the expression vectors may contain elements necessary for the replication of the vector in 
a host cell, the correct transcription and translation of the chimeric gene (e.g., promoters 
and other transcriptional regulatory elements, transcription termination signal, etc.). The 
vectors preferably also contain a selection marker gene for selecting and maintaining only 
those host cells harboring the vectors. 

For application in an intein-based multi-hybrid system of the present invention, 
the kit may further include one or more additional expression vectors each containing a 
gene encoding a test protein, e.g., a modifying enzyme (e.g., protein kinase, enzymes
catalyzing glycosylation, ribosylation, myristylation, etc.). The gene may be placed
under control of a constitutive or inducible promoter. 

When the reporter protein is a transcription activator or suppressor, the kit may 
further comprise a reporting vector. As described above, the reporting vector contains a 
detectable gene under the control of a promoter specifically activated or repressed by the 
activator or suppressor, respectively. 

In addition, the kit of the present invention can also comprise one or more types 
of host cells, for example, yeast host strains for the expression of the chimeric genes and 
other genes. Preferably, yeast strains of opposite yeast mating types (a and α) are
provided. The yeast strains should have genotypes suitable for the selection of the
various vectors based on the selection marker genes in the vectors, and suitable for the
detection of the active reporter generated in the host strains as a result of intein-mediated
protein trans-splicing. Optionally, E. coli strains for the amplification of the various 
vectors are also provided in the kit. 

Additionally, the kit may include instructions for using the kit to practice the 
present invention. The instructions should be in writing in a tangible form or stored in an
electronically retrievable form.

As will be apparent to a skilled artisan, any arrangements of the components in 
the fusion proteins of the present invention can be adopted so long as the protein trans- 
splicing mediated by the N- and C-intein and initiated by a specific interaction between 
the test polypeptides can be detected by measuring the active reporter produced during 
the protein splicing process. 

In one embodiment, as shown in Figure 3A, one fusion protein has a first test 
polypeptide X fused or conjugated to the C-terminus of an N-intein, while the other 
fusion protein has a second test polypeptide Y fused to the N-terminus of a C-intein and a 
reporter R (inactive) fused to the C-terminus of the C-intein. Upon trans-splicing, the
reporter is excised and becomes a free detectable active reporter R*.

In another embodiment, as shown in Figure 3B, one fusion protein has a first test 
polypeptide X fused to the C-terminus of an N-intein and a reporter R (inactive) fused to 
the N-terminus of the N-intein. The other fusion protein includes a second test 
polypeptide Y fused to the N-terminus of a C-intein. After trans-splicing mediated by the 
N- and C-intein, a detectable free active reporter R* is released. 

Figure 3C illustrates the fusion protein arrangement in another embodiment of the 
invention. The first fusion protein consists of a first portion of a reporter R (R1) fused to
the N-terminus of an N-intein and a first test polypeptide (X) fused to the C-terminus of 
the N-intein. The second fusion protein consists of a second test polypeptide (Y) fused to 
the N-terminus of a C-intein and the remaining portion of the reporter R (R2) fused to the 
C-terminus of the C-intein. In this manner, upon intein-directed trans-splicing, the two 
portions of the reporter R are ligated together thus forming a detectable active reporter R. 

Figure 3D is a diagram showing the fusion protein design in yet another
embodiment of the present invention. The first fusion protein consists of a first test
polypeptide (X) fused to a first portion of a reporter R (R1) which in turn is fused to the
N-terminus of an N-intein. The second fusion protein consists of a C-intein, the
remaining portion of the reporter R (R2) fused to the C-terminus of the C-intein, and a
second test polypeptide (Y) fused to R2. If the test polypeptides X and Y interact with
each other to bring the N-intein and C-intein close together, trans-splicing will result in a
detectable construct X-R-Y.

Yet another arrangement of the fusion proteins is demonstrated in Figure 3E. The
first construct is composed of a first portion of a reporter R (R1) fused to the N-terminus
of an N-intein and a test polypeptide (X) fused to the C-terminus of the N-intein. The
second construct has a C-intein, the remaining portion of the reporter R (R2) fused to the
C-terminus of the C-intein, and another test polypeptide (Y) fused to R2. Assuming test
polypeptides X and Y interact with each other, thus bringing the N-intein and C-intein
close together, trans-splicing can occur resulting in a detectable construct R-Y.

Figure 3F illustrates yet another possible arrangement of the fusion proteins in the
present invention. As shown in Figure 3F, the first fusion protein has a test polypeptide
(X) fused to a first portion of a reporter R (R1) which is in turn fused to the N-terminus of
an N-intein. The second fusion protein includes another test polypeptide (Y) fused to the
N-terminus of a C-intein and the remaining portion of the reporter R (R2) fused to the
C-terminus of the C-intein. Assuming test polypeptides X and Y interact with each other,
thus bringing the N-intein and C-intein close together, trans-splicing can occur resulting
in a detectable construct X-R.
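
The six arrangements of Figures 3A-3F can be compared with a short Python sketch. It
is offered only as an aid to reading the figures: each fusion protein is written as a list
of parts from the amino to the carboxy terminus, the part labels and function names are
hypothetical, and the sketch assumes the generic trans-splicing rule that the sequence
N-terminal to the N-intein is joined to the sequence C-terminal to the C-intein.

```python
from typing import Dict, List, Tuple

N_INTEIN = "IntN"
C_INTEIN = "IntC"

# Each entry: (first fusion protein, second fusion protein), amino to carboxy terminus.
ARRANGEMENTS: Dict[str, Tuple[List[str], List[str]]] = {
    "3A": (["IntN", "X"], ["Y", "IntC", "R"]),
    "3B": (["R", "IntN", "X"], ["Y", "IntC"]),
    "3C": (["R1", "IntN", "X"], ["Y", "IntC", "R2"]),
    "3D": (["X", "R1", "IntN"], ["IntC", "R2", "Y"]),
    "3E": (["R1", "IntN", "X"], ["IntC", "R2", "Y"]),
    "3F": (["X", "R1", "IntN"], ["Y", "IntC", "R2"]),
}


def splice_product(n_fusion: List[str], c_fusion: List[str]) -> List[str]:
    """Join the parts upstream of the N-intein to the parts downstream of the C-intein."""
    n_extein = n_fusion[: n_fusion.index(N_INTEIN)]
    c_extein = c_fusion[c_fusion.index(C_INTEIN) + 1:]
    return n_extein + c_extein


def label(parts: List[str]) -> str:
    """Render a product, writing the ligated halves R1-R2 as the complete reporter R."""
    return "-".join(parts).replace("R1-R2", "R")


if __name__ == "__main__":
    for figure, (first, second) in ARRANGEMENTS.items():
        print(figure, label(splice_product(first, second)))
```

Under this rule the sketch yields a free reporter for Figures 3A to 3C and the
constructs X-R-Y, R-Y, and X-R for Figures 3D, 3E, and 3F, respectively, matching the
products described above.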

As is apparent from the above description, the present invention provides a
powerful, versatile, intein-based yeast two-hybrid system for detecting and characterizing
protein-protein interactions. The system can be easily adapted to high-throughput
screening procedures. In particular, sensitive genetic selection assays can be
conveniently incorporated into the system using yeast cells. Detection of protein-protein
interaction is based on intein-mediated protein trans-splicing, which is independent of
other cellular factors. As a result, the system is useful in detecting biologically relevant
protein-protein interactions that occur in any intracellular compartment or even
extracellularly. For example, interactions between two nuclear proteins,
between a cytosolic and a membrane-bound protein, between two mitochondrial proteins,
between an extracellular and a membrane-bound protein, or between two extracellular
proteins can be detected. In addition, protein trans-splicing typically results in changes in
protein structures and functions and formation of free new proteins. As a result, various
methods available in the art for detecting changes in protein structures and functions can
be incorporated into the system, allowing great flexibility in fine-tuning and optimizing
the system, and adapting the system to various applications.

The present invention will be further described by way of the following examples, 
which are not intended to limit the invention in any manner. Standard techniques well 
known in the art or the techniques specifically described below were utilized. 

EXAMPLE 

To test an intein-based two-hybrid strategy, we constructed four vectors that allow
expression of different fusion proteins (see Figure 8):

1. Mp779. Heterologous sequences can be cloned into a polylinker that permits 
expression of heterologous protein fragments as a C-terminal fusion to Ura3p and 
intein fragments. Specifically, the fusion protein encoded by an Mp779-based 
expression plasmid (designated Mp779-X) will consist of the following 
fragments, listed from the amino to the carboxy terminus: 

• residues 1 to 195 of Ura3p;

• residues 283 to 557 of the VMA1 primary translation product;

• heterologous residues (designated X) of one of two interacting proteins. 

2. Mp783. Heterologous sequences can be cloned into a polylinker that permits 
expression of heterologous protein fragments as an N-terminal fusion to intein 
and Ura3p fragments. Specifically, the fusion protein encoded by an Mp783- 
based expression plasmid (designated Mp783-Y) will consist of the following 
fragments, listed from the amino to the carboxy terminus: 

• heterologous protein fragment (designated Y) that interacts with X; 

• residues 559 to 738 of the VMA1 primary translation product;

• residues 196 to 267 (the genuine C-terminus) of Ura3p.

3. Mp778. Heterologous sequences can be cloned into a polylinker that permits 
expression of heterologous protein fragments as a C-terminal fusion to Ura3p and 
intein fragments. Specifically, the fusion protein encoded by an Mp778-based
expression plasmid (designated Mp778-X) will consist of the following
fragments, listed from the amino to the carboxy terminus:

• residues 1 to 189 of Ura3p;

• residues 283 to 557 of the VMA1 primary translation product;

• heterologous residues (designated X).

4. Mp782. Heterologous sequences can be cloned into a polylinker that permits 
expression of heterologous protein fragments as an N-terminal fusion to intein 
and Ura3p fragments. Specifically, the fusion protein encoded by an Mp782- 
based expression plasmid (designated Mp782-Y) will consist of the following 
fragments, listed from the amino to the carboxy terminus:

• heterologous protein fragment (designated Y) that interacts with X;

• residues 559 to 738 of the VMA1 primary translation product;

• residues 196 to 267 (the genuine C-terminus) of Ura3p.

Using these vectors and the human genes encoding the interacting proteins BclX
and Bad, we constructed the following expression plasmids:

1. Mp778-BclX

2. Mp778-Bad

3. Mp782-BclX

4. Mp782-Bad

5. Mp779-BclX

6. Mp779-Bad

7. Mp783-BclX

8. Mp783-Bad.

Yeast (genotype: his3Δ200 leu2Δ0 met15Δ0 trp1Δ63 ura3Δ0) were transformed
with combinations of these expression plasmids and their parental vectors to test for
reconstitution of Ura3p activity that was dependent on BclX-Bad association. Two
independent clones from each transformation were streaked onto media selective for
Ura3p activity (SC-His-Trp-Ura) or selective only for the presence of the plasmids
(SC-His-Trp). As shown in Figure 9, yeast transformed with pairs of plasmids encoding
fusion proteins that could, presumably via protein splicing, reconstitute full-length Ura3p
exhibited uracil prototrophy. Specifically, yeast co-transformed with the following
plasmids could grow on uracil-deficient media:

• Mp778-BclX and Mp782-Bad

• Mp778-Bad and Mp782-BclX

• Mp779-BclX and Mp783-Bad

• Mp779-Bad and Mp783-BclX

A cartoon of the protein-protein interactions that are presumed to give rise to
functional Ura3p is shown in Figure 10. Notably, the uracil prototrophy was independent
of "orientation" of the two-hybrid interaction; that is, it was seen whether BclX was
fused to the N-terminal intein fragment and Bad was fused to the C-terminal intein
fragment or vice versa. No growth was observed when strains lacked either the BclX- or
Bad-containing fusion.

All publications and patent applications mentioned in the specification are
indicative of the level of those skilled in the art to which this invention pertains. All
publications and patent applications are herein incorporated by reference to the same
extent as if each individual publication or patent application was specifically and
individually indicated to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be obvious that
certain changes and modifications may be practiced within the scope of the appended
claims.




